The Digichaint interactive game as a virtual learning environment for Irish

Abstract. Although Text-To-Speech (TTS) synthesis has been little used in
Computer-Assisted Language Learning (CALL), it is ripe for deployment,
particularly for minority and endangered languages, where learners have little
access to native speaker models and where few genuinely interactive and engaging
teaching/learning materials are available. These considerations lie behind the
development of Digichaint, an interactive language learning game which uses 
ABAIR Irish TTS voices. It provides a language-rich learning environment for Irish 
language pedagogy and is also used as a testbed to evaluate the intelligibility, quality 
and attractiveness of the ABAIR synthetic voices.

1. Introduction

This paper describes the development of, and some of the evaluations carried out on,
a prototype interactive platform for Irish language learning, Digichaint, which
uses TTS voices developed within the ABAIR initiative (www.abair.ie) at Trinity 
College, Dublin. Digichaint is one of three distinct prototype CALL platforms 
(see Ni Chiarain & Ni Chasaide, 2015, 2016) aligned to current task-based 
language learning/teaching principles, where (incorporating TTS) the spoken 
language is central. Digichaint explores the potential of interactive speech-based 
games for Irish language pedagogy and serves as a testbed for evaluating the 
newly developed TTS voices. 

Digichaint was adapted from The Language Trap (Peirce & Wade, 2010), an 
online casual educational game for teaching German to students preparing for the 
Irish pre-university examinations, using diphone synthesis. Digichaint used The
Language Trap graphics and development framework to design a game suited to a 
similar cohort of Irish language learners. 


2. Motivation 

2.1. The Irish language context 

Using synthetic voices in interactive learning games may be far more important
to the pedagogy of an endangered minority language like Irish than to that of a
majority language. Irish is spoken as a community language only in limited
Gaeltacht regions in the West of Ireland but, as the country’s first national
language, it is a compulsory subject taught up to school-leaving age. One major
challenge in the teaching of Irish is the lack of exposure to native speaker models
(most teachers are L2 speakers), and there has tended to be an overemphasis on
written and grammatical competence.

A further major problem concerns motivation. The dearth of modern pedagogical 
resources makes it difficult to engage the learner. It is clear that the educational 
process is important to the long-term survival of the language - not only in terms 
of its transmission through teaching, but also in fostering engagement with the 
language. The synthesis-based CALL applications being piloted could contribute
not only by facilitating more extensive exposure to the spoken language and
developing aural/oral skills, but also by helping to engage learners, complementing
current classroom practices.

2.2. TTS synthetic voices in CALL 

TTS has not been widely used to date in CALL (Gupta & Schulze, 2012). As 
mentioned, the need is not as great in the major languages, given the widespread 
availability of native speaker models. The lack of TTS uptake probably also reflects
the fact that many systems yield relatively poor quality speech output, particularly 
in terms of prosody, clarity and consistency (Sha, 2010). Evaluations of the use of
synthetic speech for CALL purposes are scant, pertain to its use in rather restricted
settings, and do not include the gaming environments considered here. In the 
case of the Irish voices, there has been no formal evaluation to date. Therefore, in 
Digichaint, voices for two main dialects (Connaught and Ulster) are incorporated 
and evaluated for intelligibility, quality and attractiveness.

3. Structure and principal features of Digichaint

Digichaint is an interactive guided dialogue that allows students to progress 
through a virtual world of a hotel and its surroundings. The learner selects the 
gender/dialect for their own character - male: Connaught Irish / female: Ulster 
Irish - which were the only choices available in ABAIR at the time. The learner is 
tasked with seeking the missing half of his/her winning Lottery ticket, mistakenly
discarded, but held by one of eight characters in the hotel. To converse with other
characters the user selects phrases from a menu of up to four possible options
shown on the screen and spoken aloud (Figure 1). The goal is to avoid revealing one’s
true purpose, so as not to be double-crossed. When the holder of the other half of
the ticket is eventually identified, the learner must negotiate how the winnings are 
to be split. The game can take a great number of pathways as the user controls who 
to speak to at any given point: the choice of conversational turn determines the 
subsequent options (868 utterances were created for the game). A fragment of the 
game’s structure is shown in Figure 2. The game lasts approximately 25 minutes. 
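
The branching structure described above can be pictured as a directed graph in which each node offers a short menu of phrases and each phrase leads to a further node. The following is a minimal sketch of that idea only, not the game's actual implementation: the class names, node ids and phrases are hypothetical, and the only detail taken from the text is the limit of four options per menu.

```python
from dataclasses import dataclass, field

MAX_OPTIONS = 4  # from the text: a menu offers up to four phrases


@dataclass
class Option:
    text: str       # phrase shown on screen and spoken aloud
    next_node: str  # id of the node reached when this phrase is chosen


@dataclass
class Node:
    speaker: str
    options: list = field(default_factory=list)


# Hypothetical three-node fragment; ids and phrases are invented for illustration.
GRAPH = {
    "lobby": Node("manager", [
        Option("Dia duit.", "greeting"),
        Option("An bhfaca tú ticéad?", "suspicion"),
    ]),
    "greeting": Node("manager", [Option("Slán.", "lobby")]),
    "suspicion": Node("manager", []),  # no further options in this fragment
}


def menu(node_id):
    """Return the phrases offered at a node, capped at MAX_OPTIONS."""
    return [o.text for o in GRAPH[node_id].options[:MAX_OPTIONS]]


def choose(node_id, index):
    """Follow the chosen phrase and return the id of the next node."""
    return GRAPH[node_id].options[:MAX_OPTIONS][index].next_node
```

Because the next node depends only on the phrase selected, the number of distinct pathways grows quickly with the number of nodes, which is consistent with the large utterance inventory reported for the game.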

Figure 1. Screenshot from the virtual learning environment Digichaint 



Given only two baseline voices in ABAIR, a major challenge was to provide for the
extended cast of the game. To differentiate voices, pitch and speed manipulations
were carried out. Some manipulations rendered the voice somewhat sinister,
however. While this did not harm the story’s narrative, it does appear to have
affected the TTS evaluations (see Results and discussion below).
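
One way to picture how a cast of characters might be derived from two baseline voices is a table of per-character pitch and speed settings. The sketch below is illustrative only: the paper does not report the actual manipulation values or the tool used, so the profile fields, the semitone unit and every number are assumptions. The character names are taken from the labels in Figure 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceProfile:
    base_voice: str         # one of the two baseline voices (labels are hypothetical)
    pitch_semitones: float  # assumed unit: shift relative to the baseline voice
    rate: float             # 1.0 = baseline speaking rate


def pitch_ratio(semitones):
    """Frequency multiplier for a shift expressed in equal-tempered semitones."""
    return 2.0 ** (semitones / 12.0)


# All values below are invented for illustration.
CAST = {
    "manager": VoiceProfile("connaught_male", -2.0, 0.95),
    "companion": VoiceProfile("ulster_female", 1.0, 1.05),
    "darkman": VoiceProfile("connaught_male", -5.0, 0.85),
}


def all_distinct(cast):
    """Check that no two characters share an identical profile."""
    return len(set(cast.values())) == len(cast)
```

A large downward shift such as the hypothetical one given to "darkman" is the kind of setting that could plausibly produce the sinister quality noted above.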

As mentioned in Ni Chiarain (2014), 

“[t]he game features linguistic adaptivity[, i.e.] the language level of the 
game adapts to the user’s language level: as the user chooses more complex 
structures, the options on offer become more complex, accordingly. 
Performance feedback and motivational support are provided through 
a particular companion character in the game who, when requested, will 
tell the player that his/her selections are excellent, good, poor, etc.
Meta-cognitive hints on how well the player is doing appear as thought bubbles
linked to the main character” (p. 83, emphasis added).
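
The adaptivity described in the quotation can be sketched as a running estimate of the learner's level that steers which phrases are offered next. The source does not say how the game computes this, so the following substitutes a simple exponential moving average as a stand-in technique; the function names, the numeric complexity scale and the weight are all assumptions.

```python
def update_level(level, chosen_complexity, weight=0.3):
    """Move the level estimate towards the complexity of the phrase just chosen.

    This is an exponential moving average, used here as an assumed stand-in
    for whatever rule the game actually applies.
    """
    return (1 - weight) * level + weight * chosen_complexity


def select_options(candidates, level, k=4):
    """Return up to k (text, complexity) pairs closest in complexity to the level."""
    ranked = sorted(candidates, key=lambda c: abs(c[1] - level))
    return ranked[:k]
```

For example, a learner who repeatedly selects phrases of complexity 3 on a hypothetical 1-5 scale would see the estimate drift towards 3, and the menu would then favour phrases around that value.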

A total of 115 dictionary entries, which give the English translation of particular
words and phrases, can be accessed by clicking on underlined words in the text.
Learners receive feedback at the end on their path through the game, their star
rating, the words/phrases they looked up in the dictionary, etc., and this
information is retained for future revision.

Figure 2. Overview visual representation of a section of Digichaint illustrating 
multiple pathways 


[Figure 2 graph; node labels include: german_guest, manager, companion, german_waiter, german_guest2, german_guest3, darkman, double agent]

4. Evaluation

Evaluation of the TTS voices was carried out online by 250 pupils aged 16-17
(182 female, 68 male) in 13 schools nationwide: these included Gaeltacht (rural),
Irish-medium (urban), and English-medium schools (both rural and urban). A
pre-game questionnaire elicited background details on individual respondents. Pupils
then played the game and gave their reactions by way of a post-game questionnaire
(5-point Likert scale).


5. Results and discussion 

Pupils’ opinions on the TTS voices were elicited by means of the five questions
(Ni Chiarain, 2014) listed in Table 1. Overall, responses on the quality of the
voices were positive. 70% agreed/agreed completely that the language level was
right (Q1); as pupils differed widely in proficiency, one could expect their level to
affect intelligibility ratings. Q2 sought to establish specific difficulty with dialect
variation, and surprisingly low numbers reported difficulty. Intelligibility ratings
(Q3) were broadly positive: 56% agreed/agreed completely that the voices were
sufficiently clear to make the speech intelligible (as against 28% who
disagreed/disagreed completely). Attractiveness ratings (Q4) were rather low, with
only 43% rating the voices as attractive/very attractive (what might be considered
‘attractive’ was left open). As some characters had distorted voice quality, the low
rating was expected. The quality ratings (Q5) are reasonably high at 62%, although
here too the inclusion of distorted voices has had an impact. The ratings for
attractiveness and quality, and even intelligibility, must be interpreted in
conjunction with responses to the other two platforms (not covered here), where no
voice distortions were included and ratings were considerably higher:
intelligibility and quality both scored 73% and attractiveness scored 57%
(Ni Chiarain, 2014).


Table 1. Questions and results for Digichaint evaluation 


Q1.1 The overall standard of the Irish used in this game is at about the right level for me

Response               Count    %
Completely disagree        4    1.6%
Disagree                  40    16%
Neutral                   30    12%
Agree                    130    52%
Agree completely          46    18.4%

Q1.2 If you feel the Irish used is not at the right level, is this because it was...

Response               Count    %
Too difficult             54    48.6%

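
The percentages for Q1.1 follow directly from the response counts in Table 1, and the combined "agree" figure quoted in the discussion can be checked in the same way. The short sketch below uses only the counts reported in the table; the function names are illustrative.

```python
# Response counts for Q1.1 as reported in Table 1 (250 respondents in total).
Q1_1_COUNTS = {
    "Completely disagree": 4,
    "Disagree": 40,
    "Neutral": 30,
    "Agree": 130,
    "Agree completely": 46,
}


def percentages(counts):
    """Convert raw counts to percentages of all responses, to one decimal place."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


def agree_share(counts):
    """Combined percentage of 'Agree' and 'Agree completely' responses."""
    total = sum(counts.values())
    agreeing = counts["Agree"] + counts["Agree completely"]
    return round(100 * agreeing / total, 1)
```

Here 130 + 46 = 176 of 250 responses, i.e. 70.4%, which corresponds to the 70% reported for Q1.
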

6. Conclusions

Bearing this in mind, there is broadly positive support for the use of the Irish 
TTS voices in such interactive platforms. Note that evaluations of TTS are highly 
specific to the quality of the individual voices and there is great variability across 
systems. Evaluations capture a point in time: even since these tests, the Irish voices 
have been improved and we expect that a similar evaluation would now yield 
higher ratings. Importantly, we now know that the voices are adequate for this 
application and we now have an evaluation method that will serve for testing the 
TTS voices as development continues. 

Furthermore, evaluations of synthesis quality are relative to the context of evaluation.
When proposing to deploy TTS in CALL platforms, evaluations should be carried
out using multiple real-life platforms and real users, rather than relying on
laboratory-based, decontextualized evaluations, as is the norm in TTS evaluation.